Digital Peacebuilding Expo: Talk to the City
https://www.youtube.com/watch?v=zDGl_RwWSCQ
claude.iconThis document is a transcript of a presentation by Britney and Deger, co-founders of the AI Objectives Institute. The main content is as follows: Talk to the City is an open-source tool that uses large language models to improve collective discussion and decision-making. Working with Silent Cry, an organization that helps people released from prison and their families reintegrate into society, Britney interviewed 12 people and had GPT-4 analyze the transcriptions. The content was checked with all participants to ensure accuracy, and the final error rate was less than 3%.
Going forward, the tool will be improved so that anyone can use it easily, and integration with WhatsApp and other systems is being considered. It was also used on a trial basis with data from a citizens' meeting in Taiwan. Data privacy and data use will be explained to participants as circumstances allow. In the future, they would like to use an open-source model and let the community manage the data.
This is an initiative to make the voices of disabled and socially vulnerable people visible and to use AI to help solve their problems. On the other hand, careful handling of the data is likely to be required. nishio.iconMore details on the part from the interviews to the "final error rate of less than 3%".
claude.icon
After being given the interview transcriptions, GPT-4 generated approximately 500 claims. These claims were produced by the model based on quotes from the interviews.
Britney then reviewed all 500 generated claims with each of the interviewees, so that the participants themselves could judge whether the AI-generated content was correct. In most cases participants acknowledged that the claims were correct, but the initial error rate was about 5%.
So that's 25 cases. nishio.icon
The errors varied: some claims were put in the wrong category, some lacked nuance, and some were completely incorrect. For example, in one case the AI created a category using the inappropriate term "ex-convict". The participant had not used that term in the interview; the model generated it on its own.
Britney and her colleagues iterated again and again, fixing the correctable errors and ultimately reducing the uncorrectable errors to less than 3%. It took considerable time and effort, but in the end both Britney and the participants felt that the reports accurately represented the community.
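For scale, assuming the same base of roughly 500 claims as above, "less than 3%" works out to fewer than about 0.03 × 500 = 15 claims that remained uncorrectable.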
In short, the AI-generated content appears to have been carefully checked and repeatedly revised by humans to obtain highly accurate results. At the same time, this suggests that full automation is difficult and that human review remains essential.
---